Abstract


Evolution has devised countless remarkable solutions to diverse challenges. Understanding the mechanistic basis 
of these solutions provides insights into how biological systems can be subtly tweaked without maladaptive conse- 
quences. The knowledge gained from illuminating these mechanisms is equally important to our understanding of 
fundamental evolutionary mechanisms as it is to our hopes of developing truly rational plant breeding and synthetic 
biology. In particular, modern population genomic approaches are proving very powerful in the detection of candidate 
alleles for mediating consequential adaptations that can be tested functionally. Especially striking are signals gained 
from contexts involving genetic transfers between populations, closely related species, or indeed between kingdoms. 
Here we discuss two major classes of these scenarios, adaptive introgression and horizontal gene flow, illustrating
discoveries made across kingdoms.
Introduction


Whether it is gradual or sudden, all organisms face change. 
Adaptive responses are therefore required for survival, 
especially in species that cannot migrate. The footprints of 
these responses are found throughout the genome, serving 
as powerful signals that tell us how populations have over- 
come hazards, both biotic and abiotic. The allelic changes 
at loci mediating adaptive changes are coming to light in 
rapidly increasing numbers of studies, and thanks to ongo- 
ing developments in population genomics, descriptions of 
these loci appear in remarkably high resolution. As a result, 
there is now very good evidence that diverse sources of 
genetic variation underlie important phenotypic changes in 
wild populations. Among these, introgression is emerging
as a widespread fundamental evolutionary force. The term
‘introgressive hybridization’, hereafter referred to as ‘intro- 
gression’, was introduced by Anderson and Hubricht (1938). 
They referred to the introduction of syntenic nucleotide vari- 
ation by recombination from a donor species into the genome 
of a recipient species, usually by means of hybridization and 
backcrossing. We will use the terms 'introgression' and 'gene 
flow' as synonyms, also in cases when the units that exchange 
variants are populations of the same species. Largely context- 
dependent, introgression is influenced by an array of ecologi- 
cal factors that control the degree of contact between species. 
These biotic or abiotic factors drive selection for or against 
hybrid genotypes and can lead to complex patterns of genetic 
admixture (Hand et al., 2015). This selection has important
consequences: adaptive introgression commonly results in 
local adaptation to particular geographically distributed con- 
ditions and/or speciation (Dobzhansky, 1937; Mayr 1942; 
Coyne and Orr, 2004). 

Here we discuss the three dominant sources of genetic 
variation and their relative contributions to adaptation. We 
show how genomic approaches are revolutionizing the dis- 
covery of adaptive alleles involved in natural solutions to 
diverse challenges. We highlight how the studies of introgres- 
sion and speciation are linked, with speciation commonly 
occurring in the face of gene flow. This can create hallmark 
genomic architectures that facilitate the discovery of adaptive
alleles, which can lead to ‘genomic islands of divergence’
that are resistant to gene flow. We argue that the merger
of hybrid zone analysis and whole genome population
resequencing is a powerful novel tool to detect variants
mediating diverse adaptations. We provide an overview of 
important approaches taken to identify introgressed alleles 
that can be applied by researchers working on virtually any 
system and that have the potential to unambiguously identify 
strong candidates for adaptively introgressed alleles. Finally, 
we highlight horizontal gene transfer (HGT), a special case 
of adaptive introgression between species with established 
reproductive boundaries. 


Maladaptive introgression 


Before discussing introgression generally and adaptive intro- 
gression in particular, it is important to note that not all intro- 
gression is adaptive; indeed, introgression is a powerful force 
and can be strongly disruptive, particularly as the result of 
human activity. Introgression even has the potential to hinder
conservation attempts (Allendorf et al., 2001; Edmands,
2007). Due to a shortage of conspecific mates (Vaz Pinto et 
al., 2016), introgression may drive rare species to extinction 
by genetic swamping (Wolf et al., 2001; Gomez et al., 2015; 
Todesco et al., 2016). Conservation attempts based on the 
translocation of species to reserves outside native ranges can 
result in introgression and inadvertent admixture that dam- 
ages the biodiversity of the protected species (e.g. antelope; 
van Wyk et al., 2017). In addition, introgression between crops 
or domesticated animals and wild relatives can alter fitness- 
related traits, such as disease resistance and growth, although 
every case is unique. Recent examples include introgression 
from domesticated dogs into wolves (Anderson et al., 2009), 
between farmed and native salmonids (Glover et al., 2013;
Ozerov et al., 2016; Karlsson et al., 2016), from wild pigs into
domesticated pigs (Ai et al., 2015), from wildcats into domes-
ticated cats (Ottoni et al., 2017), between maize and teosinte
(Hufford et al., 2012; Hufford et al., 2013), from domesticated
rice into wild rice (Wang et al., 2017), and between genetically
modified plants and their wild relatives (den Nijs et al., 2004). 
Introgression between native and introduced, often invasive, 
species has also been reported, with examples including mussels
(Saarman and Pogson, 2015), salamanders (Fitzpatrick et
al., 2010; Wilcox et al., 2015), trout (Hohenlohe et al., 2013;
Muhlfeld et al., 2014; Kovach et al., 2015; Kovach et al.,
2016) and Ulmus trees (Zalapa et al., 2009). Genomes may be
resistant to the introgression of invading alleles when selec- 
tion favours the native allele, as shown in, for example, trout 
(Kovach et al., 2015; Kovach et al., 2016) and Arabidopsis 
thaliana (Lee et al., 2017). Interestingly, adaptive introgres- 
sion can also occur differentially from one subgenome of an 
allopolyploid, for example from wheat into Aegilops (Parisod 
et al., 2013). 


Introgression as an engine of adaptive 
genetic variation 


There are three primary sources of genetic variation: (1) pre- 
existing or ‘standing’ variants, which are the variants already 
present in a population, (2) new mutations, and (3) introgres- 
sion (reviewed in Olson-Manning et al., 2012; Hedrick, 2013). 
Despite introgression traditionally being seen as maladaptive,
there is a growing literature demonstrating the widespread
occurrence of adaptive introgression (reviewed in Mallet et
al., 2016). Introgression has been implicated as a powerful adaptive
force in a wide array of taxa, including, for example, the
malaria-transmitting mosquito Anopheles (Clarkson et al.,
2014; Fontaine et al., 2015), Heliconius butterflies (Zhang et
al., 2016), mice (Song et al., 2011), humans (Racimo et al.,
2017), Arabidopsis (Arnold et al., 2016), sunflowers (Whitney
et al., 2015), and monkeyflowers (Stankowski and Streisfeld, 
2015). However, the relative importance of introgression, de 
novo mutation, and standing variation is far from resolved. 
Barton (2001) concluded that adaptive variation engendered 
by mutation is likely to exceed that brought about by introgression.
This is all the more likely if the effective population
sizes are large, because the probability that a favourable mutation
arises increases with effective population size. For example,
many pests and weeds have tremendous effective popula- 
tion sizes and there is strong evidence that cases of escape 
from chemical insecticides and herbicides have originated 
many times from independent novel mutations, for example 
in Drosophila (Karasov et al., 2010) and in weeds (Délye et 
al., 2013). However, this is not always the case; for example,
Anopheles traits enhancing vectorial capacity, including the 
knockdown resistance mediated by a specific single nucleo- 
tide polymorphism, were transferred between two hybridizing 
species (Weill et al., 2000). 
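
The dependence of mutational input on effective population size can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes a diploid population, a per-copy beneficial mutation rate `mu_b`, and the classical approximation that a new beneficial mutation with small advantage `s` escapes stochastic loss with probability of roughly 2s. The function name and parameter values are hypothetical and are not taken from the studies cited above.

```python
def establishing_mutations_per_generation(ne, mu_b, s):
    """Expected number of new beneficial mutations per generation that
    escape loss by drift, under simple textbook assumptions.

    ne   : effective population size (diploid individuals)
    mu_b : beneficial mutation rate per gene copy per generation (assumed)
    s    : selective advantage of the mutation (assumed small, positive)
    """
    new_mutations = 2 * ne * mu_b   # 2*Ne gene copies can each mutate
    p_establish = 2 * s             # approximate escape-from-loss probability
    return new_mutations * p_establish


# Illustrative values only: the rate scales linearly with Ne.
for ne in (1e3, 1e6, 1e9):
    rate = establishing_mutations_per_generation(ne, mu_b=1e-9, s=0.01)
    print(f"Ne = {ne:.0e}: {rate:.1e} establishing mutations per generation")
```

Under these assumptions the supply of establishing mutations grows in proportion to effective population size, which is why very large pest and weed populations can plausibly produce the same adaptation repeatedly from independent mutations.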

Anderson (1949) suggested that introgressed variation 
should have a higher initial frequency than new adaptive 
mutations and a lower initial frequency than standing varia- 
tion. But if introgression is recurrent and results in fit progeny, 
early frequencies could be much higher, exceeding the adaptive 
potential of standing variation. Further, the impact of single 
introgressed variants on the genome can be sizable, typically 
causing multiple changes within a gene and at times affecting 
several gene-coding loci. Striking examples involve the transfer 
of entire complex adaptations via cassettes of multiple linked 
mutations, such as those in loci that control wing colour pat- 
terns for both mimicry and mate recognition in Heliconius but- 
terflies (The Heliconius Genome Consortium, 2012). Such an 
advantage can also be found in some cases of adaptive stand-
ing variation (Bastide et al., 2016) and occasionally for de novo
adaptive mutations (Karasov et al., 2010). In addition, introgressed
alleles have the benefit that they are likely to have
been pre-tested by selection in a usually closely related donor 
species and would therefore be less likely to be deleterious than 
random mutations (Hedrick, 2013). Indeed, introgressed vari- 
ants that initially have no strong advantage or disadvantage 
can accumulate in the genome as cryptic variation, which then 
serves as the raw material for selection when conditions change 
(reviewed in Paaby and Rockman, 2014). 

The ability to introgress is dependent on the degree of diver- 
gence, as introgression between more divergent species is usu- 
ally impeded by pre- and post-zygotic reproductive barriers. It 
has recently been suggested that polyploidy can occasionally 
rescue introgression between otherwise reproductively isolated 
species. For example, polyploidization re-established normal 
endosperm cellularization and enabled unidirectional inter- 
ploidal introgression and bidirectional introgression between 
tetraploids of Arabidopsis arenosa and Arabidopsis lyrata 
(Lafon-Placette et al., 2017). A case of introgression between
the autotetraploid cytotypes of these species reports the
exchange of candidate alleles for mediating adaptation to highly
challenging serpentine soils (Arnold et al., 2016). Interploidal
introgression is generally assumed to be unidirectional, from 
diploid to polyploid (Stebbins, 1971), although there is evi- 
dence that it can also occur in the reverse direction (Ramsey 
and Schemske, 1998). There have been very few genomic stud- 
ies on interploidal introgression and nearly all reports detail 
gene flow from diploids to tetraploids, for example in Betula
(Zohren et al., 2016). Another study in Miscanthus favoured 
the same unidirectionality of gene flow, although there was 
some evidence for occasional gene flow from polyploids to 
diploids (Clark et al., 2015). Due to the scarcity of genomic 
studies one is left with literature based on only handfuls of 
molecular markers, the vast majority of which support gene 
flow from diploids to polyploids, such as in Senecio (Kim et al., 
2008; Chapman and Abbott, 2010), Epidendrum (Pinheiro et 
al., 2010), and Capsella (Han et al., 2015). Except for Kim et al. 
(2008) and Chapman and Abbott (2010) who found evidence 
of introgression of fitness-related genes, it is unclear whether 
interploidal introgression frequently leads to the transfer of 
adaptive alleles. 

Hybridization, particularly in cases where hybridization 
leads to allopolyploidy, and introgression can also introduce 
variation at the structural level leading to genome rearrange- 
ments and novel gene regulatory pathways through the reac- 
tivation of dormant transposable elements (TEs) (reviewed 
in Fontdevila, 2005). For instance, introgression between 
cultivated and wild rice led to changes in transcription levels 
and DNA methylation patterns in TE-rich genomic regions 
(Liu et al., 2004). Introgression between bread wheat and 
tall wheatgrass led to various genetic and epigenetic changes, 
including deletions, differences in gene expression and TE 
reactivation (Liu et al., 2015). Similar changes have been
identified in introgression lines between cauliflower (Brassica 
oleracea) and black mustard (Brassica nigra), creating vari- 
ation that might prove useful for trait selection in breeding 
(Wang et al., 2016). 
Introgression aids adaptive allele discovery
in the genomic era 


The widespread occurrence of introgression suggests promise 
in mining for adaptive alleles in natural populations that have 
overcome identifiable biotic or abiotic challenges. In parallel, 
there is a growing interest in how knowledge obtained from 
more broadly observing natural solutions devised by evolution 
might inform breeding efforts. Mining for introgressed alleles 
between natural populations that are native to contrasting envi- 
ronments offers the possibility to identify candidate alleles that 
mediate definable adaptations. Indeed, it has been long recog- 
nized that especially where fitness-related traits and genotypes 
show clinal variation, gene-environment associations can pro- 
vide a window on the mechanisms of natural selection (Endler, 
1977). Such information is virtually impossible to discover 
with experimental crosses, although evolve and resequence 
experiments can be informative (reviewed in Long et al., 2015). 
However, mining for adaptive alleles in natural populations 
with population genomics, namely ‘reverse ecology’ (Li et al., 
2008), complements and in many ways surpasses candidate 
gene-based approaches. By providing a genome-wide view 
of the divergence landscape, population genomic studies can 
overcome key limitations of candidate-based approaches; for 
instance, they can detect polygenic adaptation provided mark- 
ers are dense, that is to say genome resequencing as opposed to 
restriction site associated DNA sequencing. Such mining for 
co-evolved alleles involved in a trait is facilitated if frequencies 
of these alleles follow a clinal pattern in a hybrid zone that may 
coincide with an environmental gradient. 

Genome-wide data from populations that are characterized 
by a history of extensive gene flow can provide detailed insights 
into the genetic basis of adaptive divergence. Usually in such 
cases only a few loci under selection rise above the neutral 
background that is homogenized by gene flow. Here we focus 
on these population genomic studies, because they can leverage 
sufficient resolution and statistical power to detect non-ran- 
dom patterns of introgression. We note, however, that crucial 
insights into the evolutionary role of adaptive introgression 
were made before the advent of high-throughput sequencing 
and population genomic datasets (reviewed in Arnold and 
Martin, 2009). Indeed, population genomics and comparative 
phylogenomics are young fields and sometimes issues with data 
quality or analytical rigor can raise concerns that render results 
ambiguous (Brower, 2013; Wen et al., 2016). Nevertheless, with 
well-designed sampling and increasingly sophisticated analy- 
sis, evidence of genomic heterogeneity in patterns of gene flow 
has been accumulating at all levels from weakly (Arnold et al., 
2016) to highly differentiated (Pardo-Diaz et al., 2012) species. 
Understanding these population- and genome-wide patterns 
of introgression in species with diverse adaptations has broad 
potential to contribute to our understanding of the fundamen- 
tal mechanisms of adaptation. 
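
One widely used test for non-random allele sharing of this kind is the ABBA-BABA test (the D statistic listed in Table 2). The following is a minimal sketch, assuming a four-taxon tree (((P1, P2), P3), O) and per-SNP derived-allele frequencies for each population; the function name and the toy data are hypothetical. In practice, significance is usually assessed by resampling genomic blocks, which is not shown here.

```python
def d_statistic(sites):
    """D statistic from derived-allele frequencies.

    sites: iterable of (p1, p2, p3, p_out) tuples, one per SNP, giving the
    derived-allele frequency in P1, P2, P3 and the outgroup for the tree
    (((P1, P2), P3), O).
    """
    abba = baba = 0.0
    for p1, p2, p3, p_out in sites:
        abba += (1 - p1) * p2 * p3 * (1 - p_out)   # P2 shares derived allele with P3
        baba += p1 * (1 - p2) * p3 * (1 - p_out)   # P1 shares derived allele with P3
    if abba + baba == 0:
        raise ValueError("no informative (ABBA or BABA) sites")
    return (abba - baba) / (abba + baba)


# Toy data (invented): two ABBA sites and one BABA site.
toy_sites = [(0.0, 1.0, 1.0, 0.0), (0.0, 1.0, 1.0, 0.0), (1.0, 0.0, 1.0, 0.0)]
print(d_statistic(toy_sites))  # positive: excess sharing between P2 and P3
```

A value near zero is expected when shared derived alleles reflect only incomplete lineage sorting, whereas a consistent excess of one pattern is compatible with gene flow between P3 and one of the sister populations.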


The genomic architecture of introgression 


The study of introgression in natural populations is deeply 
connected with the study of speciation. It is increasingly recog- 
nized that speciation frequently occurs in the face of gene flow; 
either through continual gene flow or phases of secondary con-
tact in species that have already established partial reproduc-
tive isolation, hereafter referred to as parapatrically isolated
species (PaIS) (Nosil, 2008; Feder et al., 2012; Abbott et al.,
2013; Harrison and Larson, 2014; Mallet et al., 2016; Shapiro
et al., 2016; Arnold, 2016). One iconic example is the sunflower 
Helianthus annuus subsp. texanus, which originated as a result 
of introgression and shows different adaptations compared 
with its parents, mainly abiotic tolerance traits (Whitney et al., 
2010; Whitney et al., 2015). It may be difficult to discriminate
ongoing divergence from secondary contact in PaIS (Endler, 
1977; Harrison, 2012; Gompert and Buerkle, 2016), especially 
in cases where species have recently diverged. Both scenarios 
of introgression offer potential insights into the mechanisms 
of adaptive introgression, although major findings have been 
made in cases of secondary contact (Table 1), as speciation- 
with-gene-flow studies usually aim to identify loci involved in 
the establishment and maintenance of reproductive isolation 
and not loci involved in local adaptation. Strong evidence 
for secondary contact in major systems that allow the study 
of adaptive introgression comes from clear estimates of the 
time of secondary contact as well as the delineation of clearly 
defined hybrid zones. Both the time of species divergence and 
secondary contact are relatively young in these systems (Table 
1). Except for the hybridization between Heliconius erato and 
H. melpomene (Martin et al., 2013; Kozak et al., 2015) and 
Saccharomyces cerevisiae and S. uvarum (Kellis et al., 2003), all 
other study systems show divergence times for the introgressing 
species of usually much less than 2 mya and times of secondary 
contact between a few hundred and few thousand years before 
the present time. 

Introgression between divergent populations or closely 
related species does not generally homogenize divergence lev- 
els across the entire genome (Turner et al., 2005; Coleman et al., 
2006; Harr, 2006; Michel et al., 2010; Hohenlohe et al., 2012; 
Via, 2012; Renaut et al., 2013; Malinsky et al., 2015; Marques 
et al., 2016; reviewed in Wolf and Ellegren, 2017). Genomic 
islands of divergence (GIsD), also known as genomic islands 
of speciation, manifest as regions of elevated divergence in a 
‘sea’ of neutral, non-differentiated background that has been 
homogenized by a history of gene flow. These GIsD are typi- 
cally attributed to loci under divergent selection contributing 
to adaptation to the local environment or the establishment of 
reproductive isolation relatively independent of the external 
environment (Wu and Ting, 2004; Orr et al., 2004; Rieseberg 
and Blackman, 2010; Nosil and Schluter, 2011). One example 
is the adaptation to different foraging behaviours in Darwin’s 
finches, which might have largely been driven by the ALX1 
gene that encodes a transcription factor affecting craniofa- 
cial development. Variation in this gene is strongly associ- 
ated with beak shape diversity across Darwin’s finches and 
the medium ground finch (Lamichhaney et al., 2015). GIsD 
have been reported in many cases of recent species divergence 
with ongoing gene flow (Martin et al., 2013; Jonsson et al.,
2014; Supple et al., 2015; Rougeux et al., 2016; Royer et al., 
2016; Morales et al., 2017; Kumar et al., 2017). The region 
of divergence typically extends away from the selected locus 
due to physical linkage, allowing neutral polymorphisms to
hitchhike along with a selected polymorphism (a process termed
divergence hitchhiking) and become part of the GIsD. In many
cases, this can obscure signals of selection and lead to ambi- 
guity over the exact genetic loci under selection. It is therefore 
advisable to incorporate linked selection as a null model for 
the identification of genomic regions exhibiting pronounced 
differentiation (Burri et al., 2015). 
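
The idea of islands rising above a homogenized background can be sketched as a simple window scan of differentiation. The example below is a toy illustration under stated assumptions: two populations of equal weight, known allele frequencies at each SNP, a heterozygosity-based FST computed per window as a ratio of sums, and an arbitrary outlier threshold. The function name, data, and threshold are hypothetical, and real analyses would additionally account for sample size, linked selection, and recombination rate variation as discussed above.

```python
from collections import defaultdict


def windowed_fst(snps, window_size):
    """Heterozygosity-based FST in non-overlapping windows.

    snps: iterable of (position, p1, p2), where p1 and p2 are the
    frequencies of the same allele in population 1 and population 2.
    Returns {window_start: FST}, with FST = sum(HT - HS) / sum(HT).
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for pos, p1, p2 in snps:
        p_bar = (p1 + p2) / 2
        h_t = 2 * p_bar * (1 - p_bar)                       # pooled heterozygosity
        h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # mean within-population
        start = (pos // window_size) * window_size
        num[start] += h_t - h_s
        den[start] += h_t
    return {w: num[w] / den[w] for w in sorted(den) if den[w] > 0}


# Invented frequencies: only the middle window is strongly differentiated.
toy_snps = [(100, 0.5, 0.5), (900, 0.4, 0.6),
            (1200, 1.0, 0.0), (1800, 0.9, 0.1),
            (2500, 0.5, 0.5)]
fst = windowed_fst(toy_snps, window_size=1000)
islands = [w for w, value in fst.items() if value > 0.5]  # arbitrary threshold
print(fst, islands)
```

Because elevated relative differentiation can also arise in regions of low diversity or low recombination, windows flagged this way are candidates for follow-up rather than evidence of divergent selection on their own.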

Low recombination rates in some genomic regions can 
also create GIsD. Regions of suppressed recombination 
are frequently pericentromeric and often involve structural 
changes such as inversions that can physically block recom- 
bination. They are thought to sometimes play a role in adap- 
tive genomic divergence (Rieseberg, 2001; Noor et al., 2001; 
Feder and Nosil, 2009), for example in the case of the 2L 
inversion divergence island, which was introgressed from 
Anopheles gambiae into A. coluzzii and harbours a suite of 
insecticide-resistance alleles (Lee et al., 2013a; Norris et al.,
2015). However, it is questionable if most regions of very 
low recombination contain loci under divergent selection or 
whether they were simply established due to the stochastic 
effects of genome structure and genetic drift (Turner and 
Hahn, 2010). If the amount of gene flow is very low, for 
example due to geographic isolation, then large blocks of 
pronounced genomic differentiation may arise from genetic 
drift. This could be further accentuated by variable mutation 
rates and low recombination rates (Noor and Bennett, 2009; 
White et al., 2010; Cruickshank and Hahn, 2014). 


Traits impacted by adaptive introgression 


The traits impacted by introgression are as diverse as the 
organisms that have been studied (Table 1). In plants, for 
example, a broad array of abiotic tolerance traits have been 
affected, for example ion homeostasis and drought adapta- 
tions (Arabidopsis arenosa, Arnold et al., 2016; sunflowers, 
Whitney et al., 2015), but also traits that control biotic inter- 
actions between plants and herbivores (sunflowers, Whitney 
et al., 2015) or pollinators (Senecio, RAY1 and RAY2 genes,
Kim et al., 2008), and others (poplars, PRR5 and COMT1
genes, Suarez-Gonzalez et al., 2016). In animals, clear examples
of traits influenced by adaptive introgression
include insecticide resistance in Anopheles (the single nucleotide
polymorphism L1014F in the gene kdr; Fontaine et al.,
2015; Norris et al., 2015) and rodenticide resistance in mice
(the vkorc1 gene; Song et al., 2011). Other introgressed alleles
of potentially adaptive value stem from the actin gene mac-1
in mussels (Fraisse et al., 2014), several loci that control
wing colour patterns for both mimicry and mate recogni- 
tion, amongst them the optix locus in Heliconius butterflies 
(Nadeau et al., 2014; Zhang et al., 2016), the beak shape-
associated locus ALX1 in Darwin’s finches (Lamichhaney
et al., 2015), variants that control lipid metabolism, pig- 
mentation and innate immunity in humans (Racimo et al., 
2017), but also loci of so far uncharacterized adaptive value 
in Drosophila (Garrigan et al., 2012; Brand et al., 2013) and 
mice (Staubach et al., 2012). An interesting case of intro-
gression between Drosophila species was reported to be
trans-specific, meaning that the introgressed 15 kb region
harbours a mutation that appears adaptive in both species
despite their different genomic backgrounds and ecological
requirements (Brand et al., 2013). In addition, cytonuclear
co-introgression has been shown for some Drosophila spe- 
cies (Beck et al., 2015). In the fungal genus Saccharomyces 
experimental hybridisation followed by ammonium limita- 
tion resulted in the origin of an interspecific MEP2 fusion 
gene, which encodes a high-affinity ammonium permease
that may be adaptive in nitrogen-poor environments with
ammonium as the only nitrogen source (Dunn et al., 2013).


Adaptive allele mining in hybrid zones 


In secondary contact scenarios, PaIS come into contact fol- 
lowing a period of reproductive isolation. This process is 
often induced by geographic and environmental factors, such 
as habitat shifts due to climate changes, for example oscillations
during the Pleistocene and Anthropocene, or catastrophic
human-mediated events such as fires, flooding, road construc- 
tion or introduction of alien species (Crispo et al., 2011). Such 
secondary contact between divergent but interfertile popula- 
tions frequently results in hybrid zones where the otherwise 
geographically distinct distribution ranges of the two species 
overlap, permitting the production of offspring of mixed 
ancestry (Barton and Hewitt, 1985; Barton and Hewitt, 1989; 
Harrison, 1990). In the fungus-like pathogen Albugo candida
a spectacular example of adaptive introgression after second-
ary contact has recently been reported, in which infection
of a host with virulent A. candida suppresses host immunity
enabling co-colonization by otherwise non-virulent races 
(McMullan et al., 2015). This creates hybrid zones allowing 
sexual reproduction and gene flow between isolated races; 
effector alleles are transferred that exhibit a fitness advantage 
on many potential hosts, facilitating host jumps. Then once a 
new hybrid race has established, it rapidly reproduces asexu- 
ally without continued exchange with other races. 

Hybrid zones represent natural laboratories (Hewitt, 1988) 
for the study of introgression dynamics as well as their evolu- 
tionary significance. They offer venues to observe allele flow 
between partly isolated populations and allow hybrid fitness 
to be estimated in nature. As described above, alleles under 
divergent selection are relatively inhibited from introgressing 
between populations. This barrier to the exchange of certain 
alleles generates an allele frequency gradient, or cline, across 
the hybrid zone, which partly blocks the homogenization of 
the hybridizing populations. Such a cline might be detected 
as a phenotypic transition from one parental species to the 
other, which should coincide with the underlying allelic cline. 
Hybrid zones can be understood as natural mapping experi- 
ments to determine the alleles responsible for complex traits 
that differentiate parental populations (Buerkle and Lexer, 
2008; Crawford and Nielsen, 2013). Over generations, recom- 
bination breaks down linkage, allowing alleles at discrete loci 
to be associated with phenotypes and environmental variables 
(Crawford and Nielsen, 2013). This is especially clear in large 
hybrid populations with varying levels of admixture that are 
established along environmental gradients, such as in spruce 
(Hamilton et al., 2013; Hamilton et al., 2015). Substantial
levels of clinal variation and allele-environment associations
with climatic variables such as temperature and precipitation
were found in hybrid zones of Sitka, white, and Engelmann
spruce (Hamilton et al., 2015), which suggests that species
integrity is maintained through exogenous selection in paren-
tal habitats and that hybridisation might facilitate fine-scale
adaptation of the species along environmental gradients.

Table 2. Useful population genomic and phylogenomic approaches to detect and interpret introgression events

| Analysis | Type | Software | Description | References |
|---|---|---|---|---|
| Demographic inference | Population structure | adegenet | Evaluation of population clustering using principal component and discriminant analysis | Jombart, 2008; Jombart et al., 2010 |
| Demographic inference | Population structure | STRUCTURE; fastSTRUCTURE; fineSTRUCTURE; ADMIXTURE; SpaceMix | Estimation of (fine-scale) population structure, taking admixture into account | Pritchard et al., 2000; Hubisz et al., 2009; Raj et al., 2014; Lawson et al., 2012; Alexander et al., 2009; Bradburd et al., 2016 |
| Demographic inference | Demographic history | IM; IMa2 | Fitting an isolation with migration model to haplotype data from two populations (IM) or up to ten populations (IMa2) | Hey and Nielsen, 2007; Hey, 2010a; Hey, 2010b |
| Demographic inference | Demographic history | dadi (diffusion approximations for demographic inference) | Inference of the demographic history of multiple populations from SNP frequency data | Gutenkunst et al., 2009 |
| Demographic inference | Demographic history | PSMC (pairwise sequentially Markovian coalescent model); MSMC (multiple sequentially Markovian coalescent model) | Inference of fluctuations in effective population size over time from a single genome sequence (PSMC) or multiple genome sequences (MSMC) | Li and Durbin, 2011; Schiffels and Durbin, 2014 |
| Demographic inference | Demographic history | simcoal2; fastsimcoal2 | Inference of parameter values, such as population split times and migration rates, and testing of hypotheses to compare to alternative, neutral demographic scenarios | Laval and Excoffier, 2004; Excoffier et al., 2013 |
| Demographic inference | Demographic history | Rarecoal | Inference of population history and fine-scale ancestry from rare variants | Schiffels et al., 2016 |
| Demographic inference | Network and tree methods | NeighbourNet (SplitsTree4) | Visualization of reticulate relationships in the form of splits networks | Huson and Bryant, 2006 |
| Demographic inference | Network and tree methods | TreeMix | Inference of patterns of population splits and admixture | Pickrell and Pritchard, 2012 |
| Demographic inference | Network and tree methods | PhyloNet | Coalescent-based species tree and evolutionary network method; testing the number of introgression events | Than et al., 2008 |
| Genomic architecture of introgression | Introgression detection | D statistic; f statistic | Detection of SNP window-based or genome-wide evidence of shared alleles (ABBA-BABA test) in a four-taxon case | Comparison of two genomes: Green et al., 2010; Durand et al., 2011 / comparison of two populations: Kronforst et al., 2013; Smith and Kronforst, 2013 |
| Genomic architecture of introgression | Introgression detection | D_FOIL statistic | D statistic for a symmetric five-taxon phylogeny; determination of the directionality of introgression | Pease and Hahn, 2015 |
| Genomic architecture of introgression | Introgression detection | f_d statistic | Refinement of the f statistic (Green et al., 2010) by being less sensitive to differences in diversity along the genome | Martin et al., 2015 |
| Genomic architecture of introgression | Introgression detection | R_D, U, Q95 statistics | Identification of genomic windows that are likely to have undergone adaptive introgression | Racimo et al., 2017 |
| Genomic architecture of introgression | Methods with additional visualization aspect | HybridCheck | Detection of introgressed genomic blocks and visualization of the heterogeneous, mosaic-like genome structure; dating of introgressed blocks | Ward and van Oosterhout, 2016 |
| Genomic architecture of introgression | Methods with additional visualization aspect | Haplostrips | Plot of haplotype structure at candidate regions for adaptive introgression | Marnetto et al., in prep. (cited in Racimo et al., 2017) |
| Genomic architecture of introgression | Methods with additional visualization aspect | Twisst | Topology weighting of SNP window-based trees across the genome | Martin and Van Belleghem, 2017 |

However, genotype-phenotype-environment associations 
along clines have rarely been addressed using dense genome- 
wide markers. Highlights include work in mice (Turner and 
Harr, 2014; Pallares et al., 2014; Pallares et al., 2016), for 
which genes involved in craniofacial shape variation were 
found that act in a polygenic manner (Pallares et al., 2016). 
The Heliconius and Helianthus systems are other exam- 
ples of association mapping across hybrid zones, although 
Helianthus showed spurious associations in early generation
hybrids due to linkage disequilibrium (Rieseberg and Buerkle, 
2002). Heliconius in contrast is a relatively old hybrid zone 
and close to linkage equilibrium, with phenotypic variation 
largely controlled by major effect loci, for which smaller sam-
ple sizes are sufficient (Nadeau et al., 2014). With emerging
probabilistic frameworks to infer allele frequencies in low 
coverage sequencing data it should be possible in the future 
to also address minor effect loci, for which larger sample sizes 
are required, even in polyploids (Blischak et al., 2017). 
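
The contrast between early generation hybrids and old hybrid zones follows from how recombination erodes linkage disequilibrium (LD) over time. As a minimal sketch, assuming random mating, no selection, and no ongoing migration (all simplifications that real hybrid zones violate to some degree), LD between two loci with recombination fraction `r` decays as D_t = D_0 (1 - r)^t. The function names and values below are illustrative.

```python
import math


def ld_after(d0, r, generations):
    """LD coefficient D after a number of generations of random mating.

    d0 : initial LD (for example, generated by admixture)
    r  : recombination fraction between the two loci (0 < r <= 0.5)
    """
    return d0 * (1 - r) ** generations


def generations_to_decay(r, fraction):
    """Generations needed for D to fall to `fraction` of its initial value."""
    return math.log(fraction) / math.log(1 - r)


# Unlinked loci (r = 0.5) lose LD quickly; tightly linked loci do not.
print(ld_after(0.25, 0.5, 5))            # LD between unlinked loci after 5 generations
print(generations_to_decay(0.01, 0.05))  # generations to reach 5% of D_0 at r = 0.01
```

Under these assumptions, associations between a marker and a trait in early generation hybrids can reflect loci far from the causal variant, whereas in older zones only tightly linked markers remain associated.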
Selective landscapes vary widely among different types of
hybrid zones. If selection acts on genotypes of mixed ancestry
independently of spatial variation in selection pressures, the
hybrid zone can be considered a ‘tension zone’ (Barton and
Hewitt, 1985; Barton and Hewitt, 1989). In a tension zone
selection is usually endogenous, as in the hybrid zone
between two Senecio species on Mount Etna (Brennan et al.,
2009), with occasional exceptions such as exogenous
frequency-dependent selection by predatory birds in Heliconius
butterflies (Mallet et al., 1990). Alternatively, Endler (1977)
postulated spatially varying exogenous selection along a
geographic-environmental gradient. According to another
model, the bounded hybrid superiority model (Moore, 1977),
selection in intermediate habitats can favour individuals of
mixed ancestry, which seems to be the case for the hybrid
zone between Sitka and white spruce (Hamilton et al., 2013).
In mosaic hybrid zones, parental populations are distributed
in a patchy landscape. Examples of such hybrid zones are
frequently found in parasites that rely on patchily distributed
hosts, for example in Albugo candida (McMullan et al., 2015),
but also in non-parasitic species such as mussels (Fraisse
et al., 2014), which may reflect a common scenario of complex
environmental mosaics in nature (Harrison, 1986). In a
tension zone, with endogenous selection on the hybrids, the
hybrid zone can move freely if there are asymmetries in
selection, dispersal, or population density and will eventually
stall at a geographic barrier or in an area of low population
density (Hewitt, 1975; Barton, 1979; Barton and Hewitt, 1985).
Hybrid zones might also move in response to climate change or
co-varying environmental factors (Taylor et al., 2015).


Population genomic approaches to infer 
introgression in natural populations 


A convergence of recent work on refined methods of
demographic inference and on methods to describe the genomic
architecture of introgression has great potential for the
discovery of introgressed alleles (Table 2). Demographic
inference methods evaluate population structure and provide
estimates of admixture between populations. They can also
reconstruct fine-scale demographic histories, including
fluctuations in effective population size over evolutionary
time, partly based on rare polymorphisms, or infer parameter
values, such as population split times and migration rates,
and test hypotheses by comparing alternative demographic
scenarios. The genomic architecture of introgression is
described using metrics that quantify the level of allele
sharing and linkage disequilibrium, along with
visualization methods, including topology weighting of trees
across the genome (Table 2). Genome scans search for regions
exhibiting unusual levels of divergence among populations or
species that represent candidate selected alleles (Nielsen et al.,
2005; Yant et al., 2013; Lotterhos and Whitlock, 2015; Jensen
et al., 2016). There are two main approaches to identifying
candidate loci: taking outliers of differentiation metrics or
performing explicit hypothesis testing to determine whether
the value is significantly greater than expected by chance or
under neutrality. Genome scans can be confounded by other
modes of selection and by effects of demography and are
therefore best interpreted as hypothesis generators, providing
candidate loci and processes to be tested in downstream
functional studies, where possible. It may also be more confidently
concluded that introgression is adaptive if a combination of
methods, including descriptors of the genomic architecture
of introgression and appropriately chosen selection metrics,
converge on particular loci. In some cases, compelling a priori
evidence that the trait is under similar directional selection in
both species is also available. The combination of methods
assessing the genomic architecture of introgression and
population genomic metrics was successfully applied when
mining for adaptively introgressed candidate alleles mediating
adaptation to serpentine soils in Arabidopsis (Arnold et
al., 2016). In Heliconius butterflies there was a priori evidence
that wing colour patterns involved in mimicry and mate
recognition are under directional selection in the hybridising
species, and it was then shown that the introgressed alleles are
responsible for this trait (Pardo-Diaz et al., 2012).
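
The two approaches to identifying candidate loci described above can
be contrasted in a short Python sketch. It is a minimal illustration
under stated assumptions, not the procedure of any cited study: the
differentiation metric is a simple two-population Fst computed from
allele frequencies, windows are consecutive blocks of SNPs, and the
neutral expectation is assumed to come from user-supplied simulated
values. All function names and the 1% cutoff are hypothetical.

```python
from statistics import mean

def fst_per_site(p1, p2):
    """Fst at one biallelic site from the allele frequencies of two populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                       # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return 0.0 if h_total == 0 else (h_total - h_within) / h_total

def window_fst(freqs_pop1, freqs_pop2, window_size):
    """Mean per-site Fst in consecutive, non-overlapping windows of SNPs."""
    sites = [fst_per_site(a, b) for a, b in zip(freqs_pop1, freqs_pop2)]
    return [mean(sites[i:i + window_size])
            for i in range(0, len(sites), window_size)]

def outlier_windows(values, top_fraction=0.01):
    """Approach 1: indices of windows in the upper tail of the empirical distribution."""
    n_keep = max(1, int(len(values) * top_fraction))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return sorted(ranked[:n_keep])

def empirical_p_value(observed, null_values):
    """Approach 2: share of neutral-null values at least as large as the observed one."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)
```

The outlier approach always returns candidates, even when no locus is
under selection, whereas the explicit test is only as good as the
neutral model behind `null_values`. This is one reason why scan
results are best treated as hypotheses for functional follow-up.
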


Horizontal gene transfer events as a 
source of adaptive novelty 


Alleles conferring an adaptive advantage may also be
exchanged between reproductively isolated species via HGT.
In prokaryotes HGT between distantly related species is well
established as a source of novelty and a key driver of
adaptation, most notably in the acquisition of antibiotic resistance
and genes conferring pathogenicity (Gillings, 2017). Similarly,
HGT has been shown to be prevalent among single-celled
eukaryotes (Keeling and Palmer, 2008; Andersson, 2009).
This penchant for HGT in prokaryotes has an ongoing effect
on multicellular eukaryotes; HGT between the mitochondria
of distantly related plant species is rampant (Won and
Renner, 2003; Bergthorsson et al., 2003; Mower et al., 2010;
Rice et al., 2013). HGT in the plastid genome seems relatively
infrequent, although there are cases (Rice et al., 2006;
Park et al., 2007). This may be because plant mitochondria
have a mechanism for the active uptake of DNA
and frequently fuse (Richardson and Palmer, 2007; Rice et
al., 2013), while plastids lack this tendency. HGT into the
nuclear genomes of multicellular eukaryotes is less frequent.
For any foreign DNA to be heritable it must be integrated
into the germline, which is separated from the somatic cells
and often protected from the environment by elaborate
structures. Despite this, HGT does occur in the nuclear genomes
of multicellular eukaryotes: fungi (Ambrose et al., 2014),
arthropods (Wybouw et al., 2016), nematodes (Danchin
et al., 2010), mosquitos (Klasson et al., 2009), fish (Sun et
al., 2015), sea anemones (Starcevic et al., 2008) and a broad
range of plants (Bock, 2010) have all been shown to contain
nuclear genes of HGT origin from diverse sources.

Many possible mechanisms for genetic transfers by HGT
have been suggested. Lifestyle traits that allow intimate
contact between unrelated species may increase the likelihood
of HGT. For example, close contact between a host and its
parasite, which can involve the exchange of macromolecules
including mRNAs (Kim et al., 2014), presents an opportunity
for genetic exchange (Yoshida et al., 2010; Xi et al., 2012;
Zhang et al., 2013; Zhang et al., 2014; Davis and Xi, 2015;
Yang et al., 2016), as does a reproductive cycle in which
components of the germline are more exposed to the environment or
even free living, such as those of bryophytes, lycophytes, and
ferns (Li et al., 2014). It has also been demonstrated that, in rare
cases, grafting between unrelated species can result in the
transformation of cells at the graft site (Bergthorsson et al., 2003;
Stegemann, 2009; Stegemann et al., 2012). There are also many
vectors that may transport genetic material between species,
for example bacteria, viruses or mobile genetic elements. It
has been established that pathogenic bacteria can transform
eukaryotic host cells through the injection of proteins and/or
genetic material (Lacroix and Citovsky, 2016). HGT of selfish
genetic elements, such as transposons, is rampant and could
mediate the movement of host DNA. Indeed, a study of
group I introns in angiosperm mitochondria found 32 separate
instances of HGT into plants (Cho et al., 1998). Another
study showed that 65% of the plant genomes analyzed
contained at least one instance of HGT of a long terminal repeat
retrotransposon (El Baidouri et al., 2014). This likely has an
important evolutionary function for selfish genetic elements,
allowing them to escape resistant host genomes that effectively
silence them. Finally, despite the complexity involved in
shielding gametes from the environment in seed plants, it has
been suggested that exposure to foreign pollen could result
in small windows of opportunity for illegitimate pollination
and HGT into the germline (Keeling and Palmer, 2008; Bock,
2010; Christin et al., 2012).

A key outstanding question, however, is what, if any, is the
adaptive impact of these HGT events? Arguably, genes that do
not confer an adaptive benefit are expected to decay by
neutral drift, eventually to be relegated to pseudogene status and
then lost altogether (Keeling and Palmer, 2008; Soucy, 2015).
Many factors could render a gene acquired by HGT from
a divergent organism useless, or even deleterious. A novel
genetic background could abrogate interactions essential
for gene function, while incompatible and divergent codon
biases, transcriptional elements or intron/exon splice sites
could also inactivate foreign genes. Indeed, the fate of many
genes acquired by HGT in eukaryotes is decay (Bergthorsson et al.,
2003; Mower et al., 2010; Rice et al., 2013; Mahelka et al., 2017).
However, a large body of evidence suggests that this is not
always the case (Emiliani et al., 2009; Danchin et al., 2010;
Yue et al., 2012; Christin et al., 2012; Acuña et al., 2012;
Zhang et al., 2013; Yang et al., 2013; Li et al., 2014; Ambrose
et al., 2014; Prentice et al., 2015; Sun et al., 2015; Yin et al.,
2016). The transcriptional and developmental regulation of
horizontally transferred loci, alongside evidence for purifying
or positive selection, suggests that many such events are
adaptive.

Phylogenetic analysis is the gold standard for HGT inference
(Keeling and Palmer, 2008; Brock, 2009; Soucy, 2015).
It relies on detecting incongruence between individual locus
trees and species phylogenies. Often these cases are obvious.
However, alternative evolutionary processes as well as
sampling and analytic errors can produce incongruent gene trees
and these must be considered in each case. Potential
confounding causes include gene duplication and subsequent
loss, inadequate taxonomic sampling, historical
allopolyploidization and long branch attraction (Keeling and Palmer,
2008; Brock, 2009; Soucy, 2015). Further, the possibility exists
of intracellular gene transfer from the mitochondrial and
plastid genomes; thus, genes of alpha-proteobacterial or
cyanobacterial origin should be excluded from analysis (Huang and
Gogarten, 2008; Yang et al., 2013). In addition, diverse classes
of evidence can corroborate candidate HGT events, such as codon
usage differences, intron structure or GC content.
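
As a simple illustration of one such corroborating line of evidence,
the Python sketch below compares the GC content of a candidate gene
with that of a set of native genes from the recipient genome and
reports how many standard deviations the candidate lies from the
background mean. It is a minimal sketch only: the function names are
hypothetical, any cutoff applied to the score is an assumption, and a
compositional signal cannot replace phylogenetic analysis, because
ancient transfers tend to converge on the composition of the host.

```python
from statistics import mean, stdev

def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T) in a sequence."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted

def gc_z_score(candidate, background_genes):
    """Standardised deviation of the candidate's GC content from native genes.

    background_genes: at least two native coding sequences (an assumption of
    this sketch); their GC contents must not all be identical.
    """
    background = [gc_content(s) for s in background_genes]
    spread = stdev(background)
    if spread == 0:
        raise ValueError("background GC content has zero variance")
    return (gc_content(candidate) - mean(background)) / spread
```

A large absolute score flags a gene whose composition is atypical for
the genome; the same comparison could be made for codon usage by
replacing `gc_content` with a codon frequency summary.
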


HGT and the emergence of land plants 


Strong evidence implicates ancient HGT as a key source
of adaptive novelty when the pioneering ancestor of green
plants adapted to terrestrial environments (Huang and
Gogarten, 2008; Emiliani et al., 2009; Yue et al., 2012; Yue
et al., 2013; Yang et al., 2013). This lineage experienced an
array of novel abiotic and biotic challenges upon colonisation
of land, including desiccation, UV irradiation, and microbial
attack. Analysis of the moss Physcomitrella patens identified
39 gene families that had been acquired by HGT from
prokaryotes, fungi or viruses after the split between plants
and green algae, 35 of which were shared with seed plants.
These loci are involved in a broad range of plant-specific
processes including biosynthesis, defence, stress tolerance,
vascular development, and seed germination (Yue et al., 2012; Yue
et al., 2013). In contrast to seed plants, the gametophytes and
zygotes of mosses are more exposed to the environment,
presenting an opportunity for the integration of foreign DNA.
Another example is the phenylpropanoid pathway that
produces compounds such as lignin and flavonoids, critical
components in plant structure and defence against microbes and
UV (Emiliani et al., 2009). The common ancestor of land
plants acquired the enzyme phenylalanine ammonia lyase (PAL),
which performs the first critical step in this pathway, via HGT
from what was ultimately a bacterial source (Emiliani et al., 2009).
Another example is represented by the L-Ala-D/L-Glu epimerases
(AEEs), which are ubiquitous in land plants and were
initially acquired by HGT from prokaryotes (Yang et al., 2013).
The fixation of AEEs in land plants was driven by positive
selection and was specific to land plants; they have not been
found in any other eukaryotes including the extant progenitors
of land plants, red and green algae (Yang et al., 2013).
Acquisition of genes by HGT prior to the split between
red algae and green plants is also an important source of
genes involved in the functionality of plastids (Huang and
Gogarten, 2008).


Box 1: Established and promising model
systems for adaptive introgression


It has long been debated whether introgression is more
common in plants than in other kingdoms (Mallet,
2005; Mallet, 2007). Introgression may be less common in
animals than in plants due to stronger assortative
mating and lower F1 hybrid fitness. Interestingly, there
are more animal than plant studies that find evidence for
adaptive introgression (see Table 1): for example in animals,
mussels (Fraisse et al., 2014), Drosophila (Garrigan
et al., 2012; Brand et al., 2013; Llopart et al., 2014; Beck
et al., 2015), Anopheles (Fontaine et al., 2015; Norris et al.,
2015), Heliconius butterflies (Nadeau et al., 2014; Zhang
et al., 2016), freshwater fish such as salmonids (Glover et
al., 2013; Ozerov et al., 2016; Karlsson et al., 2016), birds
such as Darwin’s finches (Lamichhaney et al., 2015), mice
(Song et al., 2011; Staubach et al., 2012), and humans
(Racimo et al., 2017); and in plants, Arabidopsis (Arnold et
al., 2016), sunflowers (Whitney et al., 2015), Senecio (Kim
et al., 2008), and poplars (Suarez-Gonzalez et al., 2016).
This might be due to stronger economic/medical interest
in animals, such as freshwater fish and malaria-transmitting
mosquitos, resulting in more intense adaptive allele
mining. However, this could also be due to higher rates of
adaptive evolution in organisms with large effective
population sizes (Gossmann et al., 2010), such as Drosophila,
Anopheles, Heliconius butterflies, some freshwater fish, as
well as mice. Introgression may provide fertile ground for
adaptive radiations (Seehausen, 2004), either by enriching
genetic variation in an initial hybridization event between
two species that may then fuel radiation or by introducing
adaptations that allow species of radiating lineages
to occupy new niches and further diversify. Introgression
has been shown to partly drive the adaptive radiations of
some plants, such as Mimulus (Stankowski and Streisfeld,
2015) and Solanum (Pease et al., 2016), and some animals,
such as Darwin’s finches (Lamichhaney et al., 2015)
and cichlids (Meier et al., 2017). Numerous cases of
adaptive introgression are also reported for fungi, such as yeast
(Dunn et al., 2013; Almeida et al., 2017), as well as Albugo
(McMullan et al., 2015).


HGT mediating pathogen resistance and 
environmental adaptation 


HGT is also a source of novelty that is exploited in pathogen
resistance. For example, a gene important for virus resistance
in domestic tomato is derived from the fusion of two
genes, both of which were acquired by HGT ultimately from
bacteria, although one of the genes appears to have passed
through a fungus (Yang et al., 2016). The obligate parasite
Phelipanche aegyptiaca acquired an albumin I gene, known to
function as a storage protein and insect toxin, by HGT from
legumes (Zhang et al., 2013). Structural predictions, the
conservation of key functional residues, and evidence of purifying
selection all suggest that the gene serves an adaptive role.
This HGT event was likely the result of historical host-parasite
interactions. Another example is the fungus Epichloë, an
intracellular plant symbiont, which has repeatedly conferred
insect resistance on its hosts through an insect toxin acquired
by the fungus via HGT from bacteria (Ambrose et al., 2014).

The reverse is also true; HGT in plants, fungi, and insects
has allowed enhanced pathogenic exploitation of plants. In
the fungus responsible for apple canker, Valsa mali, multiple
genes acquired by HGT from bacteria and fungi were
implicated in pathogenicity with putative roles in the avoidance of
host immune responses and the degradation of host tissues
(Yin et al., 2016). There is evidence that the berry borer beetle
Hypothenemus hampei adapted to its role as a pathogen of
coffee beans by the acquisition of the HhMAN1 locus (Acuña et
al., 2012). The enzyme encoded is capable of breaking down
galactomannan, the main storage polysaccharide in coffee
beans, and likely allows the beetles to exploit coffee beans as a
food source. HGT has allowed plant-parasitic root knot
nematodes to acquire from bacteria a repertoire of plant cell wall
degrading enzymes that facilitate parasitism (Danchin et
al., 2010). Finally, a study of HGT in multiple parasitic plant
lineages showed that genes acquired from the host via HGT
are not only evolving under purifying or positive selection but
are most likely to be expressed in the haustorium, the
interface between host and parasite, implying they may have an
adaptive role in host-parasite interactions (Yang et al., 2016).

HGT has also been implicated in mediating repeated adap- 
tation to stringent environmental conditions. C4 photosyn- 
thesis is a more efficient photosynthetic pathway in hot arid 
conditions that has evolved multiple times from C3 progenitors.
In the Alloteropsis grasses, two key C4 photosynthesis
proteins, phosphoenolpyruvate carboxylase and phospho- 
enolpyruvate carboxykinase, have been acquired by HGT 
at least four times from distantly related plants in close eco- 
logical contact with the grasses (Christin et al., 2012). These 
pre-adapted C4 genes likely replaced their suboptimal homo- 
logues, rapidly optimizing the C4 pathway in the recipient. 
A metabolic enzyme with a key role in glucose metabolism 
that was acquired by plant-to-plant HGT has been shown to 
be associated with fine-scale biotic and abiotic environmental 
differences in the grass Festuca ovina (Prentice et al., 2015). 

A particularly striking example of adaptation mediated 
by HGT involves the photoreceptor neochrome, a chimeric 
photoreceptor originating from a gene fusion that is thought 
to play a pivotal role in the enhanced phototropic response 
of ferns, a key adaptive trait that allowed their diversifica- 
tion following the advent of angiosperms (Kawai et al., 2003; 
Schneider et al., 2004; Kanegae et al., 2006; Schuettpelz and 
Pryer, 2009). This protein, previously thought to have arisen 
independently in ferns and algae, was initially acquired by the 
fern lineage through HGT from hornworts followed by sub- 
sequent inter-fern HGT (Li et al., 2014). As discussed above, 
both ferns and hornworts have lifestyle traits that likely make 
them more susceptible to HGT. Finally, there is evidence
of HGT in commercially important species: domesticated 
sweet potato contains four transcribed Agrobacterium genes 
(Kyndt et al., 2015) and the silkworm Bombyx mori contains 
10 expressed genes from bacterial sources, with putative
functions in disease resistance and metabolism (Zhu et al., 2011).

Thus, HGT has had a profound impact on the genesis of 
many important adaptive plant traits such as the acquisition of 
endosymbionts, the origin of C4 photosynthesis, and the emer- 
gence of terrestrial plants. Compared with prokaryotes, HGT 
in eukaryotes is considered rare but the impact of such HGT 
events on the evolutionary trajectories of their recipients can 
be large. In this era of high-throughput sequencing technolo- 
gies and especially whole genome sequencing, cases of HGT in 
multicellular eukaryotes are increasingly likely to be identified. 


Near-term perspectives on adaptive allele 
mining using adaptive introgression 


When phenotype-driven, allele mining for introgressed loci is a 
powerful tool for the identification of strong candidate alleles 
underlying particular adaptations. Indeed, we are at a water- 
shed moment in the history of these approaches. There stands 
behind us a rich history, with long established model systems 
poised to be married to modern population genomics. It is now 
possible to detect loci under divergent selection, to test if the 
selected alleles have been introgressed and to associate these 
candidate alleles with phenotypes in ultra-high genomic reso- 
lution. In cases where these studies focus on clear phenotypes, 
it is obvious that such mixtures of approaches will engender 
rapid developments in understanding the mechanisms of 
adaptation (Box 1). Further, they stand to reveal the historic 
and geographic context of adaptive introgression mediating 
complex traits in the rich complexity presented by nature. 